Abstract

For $\tau$ a stopping rule adapted to a sequence of $n$ iid observations, we define the loss to be
$E[q(R_\tau)]$, where $R_j$ is the rank of the $j$th observation, and $q$ is a nondecreasing function of the
rank. This setting covers both the best-choice problem with $q(r) = 1(r > 1)$, and Robbins'
problem with $q(r) = r$. As $n \to \infty$ the stopping problem acquires a limiting form which is
associated with the planar Poisson process. Inspecting the limit we establish bounds on the
stopping value and reveal qualitative features of the optimal rule. In particular, we show that
the complete history dependence persists in the limit, thus answering a question asked by Bruss
[3] in the context of Robbins' problem.
1. Introduction  Let $X_1, \dots, X_n$ be a sequence of iid observations, sampled from the
uniform distribution on $[0, 1]$ (in the setup of this paper this assumption covers the
general case of arbitrary continuous distribution). For $j \in [n] := \{1, \dots, n\}$ define
final ranks as

$$R_j = \sum_{k=1}^{n} 1(X_k \le X_j),$$

so $(R_1, \dots, R_n)$ is an equiprobable permutation of $[n]$. Let $q : \mathbb{N} \to \mathbb{R}_+$ be a nondecreasing
loss function with $q(1) < q(\infty) := \sup q(r)$. In 'secretary problems' [20] one
is typically interested in the large-$n$ behaviour of the minimum risk

$$V_n(\mathcal{T}_n) = \inf_{\tau \in \mathcal{T}_n} E[q(R_\tau)], \qquad (1)$$

where $\mathcal{T}_n$ is a given class of stopping rules with values in $[n]$. Two classical loss
functions are

(i) $q(r) = 1(r > 1)$, for the best-choice problem of maximising the probability of
stopping at the minimum observation $X_{n,1} := \min(X_1, \dots, X_n)$,

(ii) $q(r) = r$, for the problem of minimising the expected rank.

Many results are available for the case where $\mathcal{T}_n$ in (1) is the class $\mathcal{R}_n$ of rank rules,
which are the stopping rules adapted to the sequence of initial ranks

$$I_j = \sum_{k=1}^{j} 1(X_k \le X_j) = \sum_{k=1}^{j} 1(R_k \le R_j) \qquad (j \in [n]),$$

see [SJ [91 [TO]. By independence of the initial ranks, the optimal decision to stop at the
$j$th observation depends only on $I_j$. The limiting risk $V_\infty(\mathcal{R}) := \lim_{n\to\infty} V_n(\mathcal{R}_n)$ has
an interpretation in terms of a continuous-time stopping problem [10]. Explicit formulas
for $V_\infty(\mathcal{R})$ are known in some cases, for bounded and unbounded $q$, including the two
classical loss functions and their generalisations [21 El El EH EH].

Much less explored are the problems where $\mathcal{T}_n$ is the class $\mathcal{F}_n$ of all stopping rules
adapted to the natural filtration $(\sigma(X_1, \dots, X_j),\ j \in [n])$. The principal difficulty
here is that, for general $q$, the decision to stop on $X_j$ must depend not only on $X_j$
but also on the full vector $(X_{j-1,1}, \dots, X_{j-1,j-1})$ of order statistics of $X_1, \dots, X_{j-1}$.
In this sense, the optimal rule is fully history-dependent. Specifically, the $\mathcal{F}_n$-optimal
rule has the form

$$\tau_n = \min\{j : X_j < h_{n,j}(X_{j-1,1}, \dots, X_{j-1,j-1})\} \qquad (2)$$

(with $h_{n,1} = \mathrm{const}$, $h_{n,n} = 1$), where $(h_{n,j},\ j \in [n])$ is a collection of functions with
certain monotonicity properties. The dependence on history is reducible to the first
$m - 1$ order statistics if $q$ is truncated at $m$: $q(r) = q(m)$ for $r \ge m$, but even then
the analytical difficulties are severe. The asymptotic value $V_\infty(\mathcal{F}) := \lim_{n\to\infty} V_n(\mathcal{F}_n)$
is known explicitly only for the best-choice problem (hence for any $q$ truncated at
$m = 2$), see [12] for the formula and history. Robbins' problem is the problem (1)
with $\mathcal{T}_n = \mathcal{F}_n$ and the linear loss function $q(r) = r$, see [JJ, El IU E].

The full history dependence makes explicit analysis of the $\mathcal{F}_n$-optimal rule hardly
possible, thus it is natural to seek tractable smaller classes of rules, with some
kind of reduced dependence on the history. Of course, the rank rules form one such
class, and the optimal rule in $\mathcal{R}_n$ is also of the form (2), with the special feature
that $h_{n,j}(x_1, \dots, x_{j-1}) = x_{\iota_n(j)}$ (for $x_0 := 0 < x_1 < \dots < x_{j-1} < 1$ and $j > 1$), where
$\iota_n(j) \in \{0, \dots, j - 1\}$ is some threshold value of $I_j$, and $h_{n,1} = 0$. Another interesting
possibility is to consider the class $\mathcal{M}_n$ of memoryless rules of the form

$$\tau = \min\{j : X_j < f_{n,j}\}, \qquad (3)$$

where $(f_{n,j},\ j \in [n])$ is an increasing sequence of thresholds. These rules are again
of the form (2), this time with constants in the role of the functions $h_{n,j}$. By familiar
monotonicity arguments (which we recall in Section 4) the limiting value $V_\infty(\mathcal{M}) :=
\lim_{n\to\infty} V_n(\mathcal{M}_n)$ (finite or infinite) exists for arbitrary $q$. See [TH1 [JS] for other classes
of stopping rules with restricted dependence on history.

Memoryless rules were intensively studied in the context of Robbins' problem, in
which case they outperform, asymptotically, the rank rules, meaning that $V_\infty(\mathcal{M}) <
V_\infty(\mathcal{R})$, see [TJ HI [5]. In a recent survey of Robbins' problem Bruss [3] stressed that
a principal further step would be to either prove or disprove that $V_\infty(\mathcal{F}) < V_\infty(\mathcal{M})$.
Coincidence of the asymptotic values $V_\infty(\mathcal{F}) = V_\infty(\mathcal{M})$ would imply that history
dependence of the overall optimal rule is negligible, meaning that in deciding about
some $X_j$ one should essentially focus on the current observation alone.

In this paper we extend the approach in [TTJ [TJl [T31 [TJ] by establishing that the
stopping problem in $\mathcal{F}_n$ has a limiting '$n = \infty$' form based on the planar Poisson
process. The interpretation of limit risks in terms of the infinite model makes obvious
the inequality $V_\infty(\mathcal{F}) < V_\infty(\mathcal{M})$ for any $q$ provided the values are finite, which is true
for both the best-choice problem and Robbins' problem. Thus the complexity does
not disappear in the limit, and the full history dependence persists. The finiteness
is guaranteed if $q(r)$ does not grow too fast, e.g. $q(r) < c \exp(r^\beta)$ $(0 < \beta < 1)$ is
enough. In connection with Robbins' problem, the limiting form was reported by the
author at the INFORMS Conference on Applied Probability (Atlanta, 14-16 June
1995), although the Poisson embedding had been exploited earlier [S] in the analysis
of rank rules. See [15] for a similar development in the problem of minimising $E[X_\tau]$.

2. A model based on the planar Poisson process  Throughout we shall use the
notation $\overline{\mathbb{N}} = \mathbb{N} \cup \{\infty\}$, and $\overline{\mathbb{R}}_+ = [0, \infty]$ for the compactified halfline.

Let $\mathcal{P}$ be the scatter of atoms of a homogeneous Poisson point process in the strip
$[0, 1] \times \mathbb{R}_+$, with the intensity measure being the Lebesgue measure $dt\,dx$. The infinite
collection of atoms can be labelled $(T_1, X_{1,1}), (T_2, X_{1,2}), \dots$ by increase of the second
component. Thus $\mathbf{X}_1 := (X_{1,1}, X_{1,2}, \dots)$ is the increasing sequence of points of a unit
Poisson process on $\mathbb{R}_+$, the $T_r$'s are iid uniform $[0, 1]$, and $\mathbf{X}_1$ and $(T_r,\ r = 1, 2, \dots)$
are independent. An atom $(T_r, X_{1,r}) \in \mathcal{P}$ will be understood as an observation with
value $X_{1,r}$, arrival time $T_r$ and final rank $r$. We define the initial rank of $(T_r, X_{1,r})$ as
one plus the number of atoms in the open rectangle $]0, T_r[\, \times\, ]0, X_{1,r}[$. Note that
coordinate-wise ties among the atoms only have probability zero.
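The construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `sample_scatter`, `ranks` and the parameter `cutoff` are hypothetical, and only atoms with value below `cutoff` are generated (atoms above that level have larger final ranks, so the ranks of the generated atoms are unaffected).

```python
import random

def sample_scatter(cutoff, rng):
    """Atoms (T_r, X_{1,r}) with value below `cutoff`, listed by increasing value.

    Values are the points of a unit-rate Poisson process on the half-line
    (gaps are Exp(1)); arrival times are iid uniform on [0, 1].
    """
    atoms = []
    x = rng.expovariate(1.0)
    while x < cutoff:
        atoms.append((rng.random(), x))
        x += rng.expovariate(1.0)
    return atoms

def ranks(atoms):
    """Return (time, value, final_rank, initial_rank) tuples sorted by arrival time."""
    rows = []
    for r, (t, x) in enumerate(atoms, start=1):
        # initial rank: one plus the number of atoms in the open rectangle ]0,t[ x ]0,x[
        initial = 1 + sum(1 for (s, y) in atoms if s < t and y < x)
        rows.append((t, x, r, initial))
    return sorted(rows)

if __name__ == "__main__":
    for row in ranks(sample_scatter(5.0, random.Random(0))):
        print("t=%.3f  x=%.3f  final=%d  initial=%d" % row)
```

The initial rank never exceeds the final rank, and the first atom to arrive always has initial rank 1.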

To treat in a unified way both finite and infinite point configurations in the strip,
we introduce the space $\mathcal{X}$ of all nondecreasing nonnegative sequences $\mathbf{x} = (x_1, x_2, \dots)$
where $x_r \in \overline{\mathbb{R}}_+$, with the convention that a sequence with finitely many proper
terms is always padded by infinitely many terms $\infty$. In particular, the sequence
$\varnothing := (\infty, \infty, \dots)$ is the sequence with no finite terms. The space $\mathcal{X}$ is endowed with
the product topology inherited from $\overline{\mathbb{R}}_+^\infty$. We denote $\mathbf{x} \cup x$ the nondecreasing sequence
obtained by inserting $x \in \overline{\mathbb{R}}_+$ in $\mathbf{x}$, with the understanding that $\mathbf{x} \cup \infty = \mathbf{x}$. A strict
partial order on $\mathcal{X}$ is defined by setting $\mathbf{x} \prec \mathbf{y}$ if $x_r \le y_r$ for $r = 1, 2, \dots$ with at least
one of the inequalities strict. Clearly, $\mathbf{x} \cup x \prec \mathbf{x}$ for $x < \infty$.

We regard $\mathbf{X}_1$ as the terminal state of an $\mathcal{X}$-valued process $(\mathbf{X}_t,\ t \in [0, 1])$, where
$\mathbf{X}_t$ is obtained by removing the entries $X_{1,r}$ of $\mathbf{X}_1$ with $T_r > t$. Clearly, $\mathbf{X}_t$ is an
increasing sequence of atoms of a Poisson process on $\mathbb{R}_+$ with intensity measure $t\,dx$.
For $t \in \{T_r\}$ let $X_t$, $R_t$, $I_t$ be the value, the final rank and the initial rank of the
observation arrived at time $t$, respectively, and for $t \notin \{T_r\}$ let $X_t = R_t = I_t = \infty$.
We have $\mathbf{X}_t = \mathbf{X}_{t-} \cup X_t$, so $\mathbf{X}_t = \mathbf{X}_{t-}$ unless $t \in \{T_r\}$.

The process $(\mathbf{X}_t,\ t \in [0, 1])$ is Markovian, with right-continuous paths, the initial
state $\mathbf{X}_0 = \varnothing$ and the jump-times $\{T_r\}$ which comprise a dense subset of $[0, 1]$. Each
component $(X_{t,i},\ t \in [0, 1])$ is a nonincreasing process, which satisfies $X_{0+,i} = \infty$ and
changes its value at every $i$-record (observation of initial rank $i$). The jump-times of
$(X_{t,i},\ t \in [0, 1])$ are the arrival times of $i$-records; these occur according to a Poisson
process of intensity $t^{-1}dt$ independently for distinct $i \in \mathbb{N}$, as is known from
extreme-value theory.

Define a stopping rule $\tau$ to be a variable which may only assume one of the random
values $\{T_r\} \cup \{1\}$, and satisfies the measurability condition $\{\tau \le t\} \in \sigma(\mathbf{X}_s,\ s \le t)$ for
$t \in [0, 1]$. The condition says that the decision to stop not later than $t$ is determined
by the atoms $\mathcal{P} \cap ([0, t] \times \mathbb{R}_+)$ arrived within the time interval $[0, t]$. Such rules are called
'canonical stopping times' in [15, Definition 2.1].

We fix a nondecreasing nonnegative loss function $q$ satisfying $q(1) < q(\infty)$. The
risk incurred by stopping rule $\tau$ is assumed to be

$$E[q(R_\tau)] = \sum_{r=1}^{\infty} q(r)\, P(\tau = T_r) + q(\infty)\, P(\tau = 1), \qquad (4)$$

where the terminal component is nonzero if and only if $P(\tau = 1) > 0$. Let $\mathcal{F}$ be the
set of all stopping rules, and let $V(\mathcal{F}) = \inf_{\tau \in \mathcal{F}} E[q(R_\tau)]$ be the minimal risk.

The class $\mathcal{R}$ of rank rules is defined by a more restrictive measurability condition
$\{\tau \le t\} \in \sigma(I_s,\ s \le t)$ for $t \in [0, 1]$. That is to say, under a rank rule the information of
the observer at time $t$ amounts to the collection of arrival times on $[0, t]$ of $j$-records, for
all $j \in \mathbb{N}$. The optimal stopping problem in $\mathcal{R}$ is equivalent to 'the infinite secretary
problem' in [10]. By [10, Theorem 4.1] there exists an optimal rank rule of the form
$\tau = \inf\{t : I_t \le \iota(t)\}$ ($\inf \varnothing = 1$), where $\iota : [0, 1[ \to \mathbb{N} \cup \{0\}$ is a nondecreasing
function. For instance, in the best-choice problem $\iota(t) = 1(t > e^{-1})$.

A memoryless rule is a stopping rule of the form

$$\tau = \inf\{t : X_t < f(t)\} \quad (\text{with } \inf \varnothing = 1), \qquad (5)$$

where $f : [0, 1[ \to \overline{\mathbb{R}}_+$ is a nondecreasing function. Denote $\mathcal{M}$ the class of memoryless
rules, and denote $V(\mathcal{M}) = \inf_{\tau \in \mathcal{M}} E[q(R_\tau)]$ its stopping value. One could consider
a larger class of stopping rules by which the decision to stop depends only on the
current observation. However, the following lemma, analogous to [TJ Lemma 2.1],
shows that such an extension of $\mathcal{M}$ does not reduce the risk.

Lemma 1. Let $A \subset [0, 1] \times \mathbb{R}_+$ be a Borel set. For the stopping rule $\tau = \inf\{t : (t, X_t) \in
A\}$ there exists a memoryless rule whose expected loss is not larger than that of $\tau$.

Proof. It is sufficient to consider sets $A$ such that the area of $A \cap ([0, t] \times \mathbb{R}_+)$ is
finite for every $t < 1$. Indeed, if the area of $A \cap ([0, s] \times \mathbb{R}_+)$ is infinite for some
$s < 1$ then $\tau \le s$ a.s., hence letting $A'$ be $A \cap ([0, s] \times \mathbb{R}_+)$ shifted by $1 - s$ to
the right we obtain a rule not worse than $\tau$. Replace each vertical section of $A$ by
an interval adjacent to $0$ of the same length, thus obtaining the subgraph of a function
$g$. This preserves the distribution of the stopping rule and does not increase the risk,
by the monotonicity of $q$. Break $[0, 1]$ into intervals of equal size $\delta$ and approximate
$g$ (in $L^1$) by a right-continuous function $g_\delta$, constant on these intervals. Suppose on
some adjacent intervals $[t, t + \delta[$, $[t + \delta, t + 2\delta[$ we have $g_\delta(t) > g_\delta(t + \delta)$. Let $g'_\delta$ be
another piecewise constant function with exchanged values on these intervals, $g_\delta(t + \delta)$
and $g_\delta(t)$, but outside $[t, t + 2\delta]$ coinciding with $g_\delta$. Let $\mathcal{P}'$ be the scatter of atoms
obtained by exchanging the strips $[t, t + \delta[\, \times \mathbb{R}_+$ and $[t + \delta, t + 2\delta[\, \times \mathbb{R}_+$. Obviously,
$\mathcal{P}' \stackrel{d}{=} \mathcal{P}$. To compare two stopping rules $\tau$ and $\tau'$ defined as in (5), but with $g_\delta$,
respectively $g'_\delta$, in place of $f$, we consider the selected atom $(\tau, X_\tau)$ as a function of
$\mathcal{P}$, and consider $(\tau', X_{\tau'})$ as a function of $\mathcal{P}'$. It is easy to see that $X_\tau = X_{\tau'}$ unless
$([t + \delta, t + 2\delta[\, \times [0, g_\delta(t + \delta)]) \cap \mathcal{P} \ne \varnothing$, whereas in the latter case $X_{\tau'}$ is stochastically
smaller than $X_\tau$. The advantage comes from the event that each of the strips contains
an atom below the graph of $g_\delta$. It follows that $\tau'$ does better. Iterating this exchange
argument, we see that the rule defined by $g_\delta$ is improved by a memoryless rule with
a nondecreasing piecewise constant function. Letting $\delta \to 0$ shows that one can reduce $A$ to a
subgraph of a monotonic $f : [0, 1[ \to \overline{\mathbb{R}}_+$. □
Given the initial rank $I_t = i$ and the value $X_t = x$ of some observation at time $t$,
the final rank of the atom $(t, x)$ is $i$ plus the number of atoms south-east of $(t, x)$, the
latter being a Poisson variable with parameter $\bar{t}x$, where and henceforth

$$\bar{t} := 1 - t.$$

By independence properties of $\mathcal{P}$, the adapted loss incurred by stopping at $(t, x)$ is
equal to $Q(\bar{t}x, i)$, where

$$Q(\xi, i) := \sum_{r=i}^{\infty} q(r)\, e^{-\xi} \frac{\xi^{r-i}}{(r-i)!}. \qquad (6)$$

For instance, $Q(\bar{t}x, i) = 1 - e^{-\bar{t}x}\, 1(i = 1)$ in the best-choice problem, and $Q(\bar{t}x, i) =
\bar{t}x + i$ in Robbins' problem. The formula for $Q$ is extended for infinite values of the
arguments as $Q(\cdot, \infty) = Q(\infty, \cdot) = q(\infty)$. It is seen from the identity

$$\frac{d^{\,i-1}}{d\xi^{\,i-1}} \left[ e^{\xi} Q(\xi, 1) \right] = e^{\xi} Q(\xi, i)$$

that the series $Q(\cdot, i)$ have the same convergence radius for all $i$.
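As a numerical illustration of (6), the following sketch evaluates $Q(\xi, i)$ as a truncated series, reading (6) as the expectation of $q(i + N)$ for $N$ a Poisson variable with parameter $\xi$. The function name `Q`, its signature and the truncation parameter `terms` are assumptions made for this example; the truncation is adequate only for moderate $\xi$ and for $q$ of moderate growth.

```python
import math

def Q(xi, i, q, terms=200):
    """Truncated series for Q(xi, i) = sum_{j>=0} q(i + j) e^{-xi} xi^j / j!."""
    total = 0.0
    weight = math.exp(-xi)        # Poisson(xi) probability of j = 0
    for j in range(terms):
        total += q(i + j) * weight
        weight *= xi / (j + 1)    # advance to the probability of j + 1
    return total

def best_choice(r):
    """Loss q(r) = 1(r > 1)."""
    return 1.0 if r > 1 else 0.0

def linear(r):
    """Loss q(r) = r of Robbins' problem."""
    return float(r)

if __name__ == "__main__":
    print(Q(0.7, 1, best_choice), 1 - math.exp(-0.7))   # two equal values
    print(Q(2.0, 3, linear))                            # xi + i = 5
```

The two printed lines reproduce the closed forms quoted in the text for the best-choice problem and for Robbins' problem.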

3. Memoryless rules and finiteness of the risk  For $\tau$ a memoryless rule (5)
with monotone $f$, denote $L(f) = E[q(R_\tau)]$ the expected loss. Introduce the integrals

$$F(t) = \int_0^t f(s)\, ds, \qquad S(x) = \int_0^x f^{-1}(y)\, dy = x f^{-1}(x) - F(f^{-1}(x)),$$

where $f^{-1}$ is the right-continuous inverse with $f^{-1}(x) = 0$ for $x < f(0)$. Note that
$P(\tau > t) = \exp(-F(t))$, and that given $\tau = t < 1$ the law of $X_\tau$ is uniform on $[0, f(t)]$.
The formula for the risk follows by conditioning on the location of the leftmost atom
below the graph of $f$ and using the fact that the configurations of atoms above the
graph and below it are independent:

$$L(f) = \int_0^1 e^{-F(t)}\, dt \int_0^{f(t)} Q(\bar{t}x + S(x), 1)\, dx + e^{-F(1)} q(\infty). \qquad (7)$$
Assuming that $F(1) = \infty$, so the terminal part is $0$, computation of the first
variation of $L(f)$ shows that an optimal $f$ must satisfy a rather complicated functional
equation:

$$Q(f(t) - F(t), 1) = \int_t^1 \exp(F(t) - F(s))\, ds \left[ \int_0^{f(t)} Q(S(x) + x\bar{s}, 1)\, dx + \int_{f(t)}^{f(s)} Q(S(x) + x\bar{s}, 2)\, dx \right]. \qquad (8)$$


A rough upper bound

$$L(f) < \int_0^1 e^{-F(t)}\, dt \int_0^{f(t)} Q(x, 1)\, dx + e^{-F(1)} q(\infty) \qquad (9)$$

follows from $\bar{t}x + S(x) < x$.
The bound (9) is computable for the loss functions

$$q(r) = (r - 1)(r - 2) \cdots (r - \ell) \qquad (\ell \in \mathbb{N}), \qquad (10)$$

in which case we have a very simple formula $Q(\xi, 1) = \xi^{\ell}$, and (9) becomes

$$L(f) < (\ell + 1)^{-1} \int_0^1 e^{-F(t)} f(t)^{\ell+1}\, dt.$$

Solving the variational problem for $F$ with boundary conditions $F(0) = 0$, $F(1) = \infty$,
we see that the minimal value of the right-hand side is $(\ell + 1)^{\ell}$, which is attained by
the function $f(t) = (\ell + 1)/(1 - t)$.

It is instructive to directly analyse the memoryless rules with hyperbolic threshold

$$f_b(t) := \frac{b}{1 - t} \qquad (b > 0)$$

and $q$ as in (10). We calculate $e^{-F(t)} = (1 - t)^b$ and $S(x) = x - b - b\log(x/b)$ (for
$x > f(0) = b$). For $\ell = 1$ integrating by parts in (7) we obtain

$$L(f_b) = \frac{b}{2} + \frac{1}{b^2 - 1}, \qquad (11)$$

which is finite for all $b > 1$, with the minimum $1.3318\cdots$ attained at $b = 1.9469\cdots$
(which agrees with PQ Example 4.2] where the minimum is $2.3318\cdots$ for the linear
loss $q(r) = r$). For $\ell = 2$

$$L(f_b) = \frac{b^2}{3} + \frac{2(b^4 - 2b^3 + 2b^2 + 6b - 4)}{(b - 2)(b - 1)^2 (b + 1)(b + 2)}, \qquad (12)$$

which is finite for all $b > 2$, with minimum $4.4716\cdots$ at $b = 2.96439\cdots$. Formulas
become more involved for larger $\ell$, a common feature being that $L(f_b) < \infty$ for
$b > \ell$. For $\ell = 3$, the minimum is $24.8061\cdots$ at $b = 3.9734\cdots$. For $\ell = 4$, the minimum is
$194.756\cdots$ at $b = 4.979\cdots$. The upper bound (9) becomes

$$L(f_b) < \int_0^1 (1 - t)^b\, dt \int_0^{b/(1-t)} x^{\ell}\, dx = \frac{b^{\ell+1}}{(\ell + 1)(b - \ell)},$$

which attains its minimum at $b = \ell + 1$ in agreement with what we have obtained above.
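The numerical values quoted for the hyperbolic thresholds can be checked with a short script. The sketch below assumes that (11) reads $b/2 + 1/(b^2 - 1)$ and that (12) reads $b^2/3 + 2(b^4 - 2b^3 + 2b^2 + 6b - 4)/((b-2)(b-1)^2(b+1)(b+2))$; the helper names (`loss_l1`, `loss_l2`, `upper_bound`, `argmin`) are hypothetical, and the ternary search assumes each function is unimodal on the search interval.

```python
def loss_l1(b):
    """Formula (11): L(f_b) for q(r) = r - 1, valid for b > 1."""
    return b / 2 + 1 / (b * b - 1)

def loss_l2(b):
    """Formula (12): L(f_b) for q(r) = (r - 1)(r - 2), valid for b > 2."""
    num = 2 * (b**4 - 2 * b**3 + 2 * b**2 + 6 * b - 4)
    den = (b - 2) * (b - 1) ** 2 * (b + 1) * (b + 2)
    return b * b / 3 + num / den

def upper_bound(b, l):
    """Bound b^(l+1) / ((l+1)(b-l)) derived from (9), valid for b > l."""
    return b ** (l + 1) / ((l + 1) * (b - l))

def argmin(fn, lo, hi, iters=200):
    """Ternary search for the minimiser of a unimodal function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if fn(m1) < fn(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

if __name__ == "__main__":
    b1 = argmin(loss_l1, 1.01, 10.0)
    b2 = argmin(loss_l2, 2.01, 10.0)
    print(b1, loss_l1(b1))    # compare with the values quoted after (11)
    print(b2, loss_l2(b2))    # compare with the values quoted after (12)
    print(argmin(lambda b: upper_bound(b, 3), 3.01, 20.0), upper_bound(4.0, 3))
```

For $\ell = 3$ the last line locates the minimiser of the bound at $b = 4$ with value $4^3 = 64$, in line with the general value $(\ell + 1)^{\ell}$.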

Remark. Notably, the memoryless rule with threshold $f_{\ell+1}$ is overall optimal in the
related stopping problem $E[(X_\tau)^{\ell}] \to \inf$, for arbitrary $\ell > 0$. For $\ell = 1$ we face
here a variant of 'Moser's problem' associated with $\mathcal{P}$ (see p], El EE] and references
therein).

It can be read from [3j [H [7] that for the linear loss $q(r) = r$ we have $V(\mathcal{M}) =
\inf_f L(f) < V(\mathcal{R}) = 3.8695\cdots$

The minimiser of $L(f)$ is not known explicitly, but some approximations to it can
be read from [lj (where they appear in the course of asymptotic analysis of the finite-$n$
Robbins' problem). We did not succeed in solving (8) even for the best-choice problem,
although there is a simple suboptimal rule with constant threshold $f(t) = 1.503\cdots$
achieving $L(f) = 1 - 0.517\cdots$ (to be compared with the value $V(\mathcal{F}) = 1 - 0.580\cdots$,
see PH p. 682]), hence beating the rank rules: $V(\mathcal{M}) < V(\mathcal{R}) = 1 - 0.368\cdots$
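The figure for the constant threshold can be reproduced numerically. For the best-choice loss one has $Q(\xi, 1) = 1 - e^{-\xi}$, and for $f \equiv c$ the correction $S(x)$ vanishes on $[0, c]$, so (7) reduces to $1 - L(f) = \int_0^1 e^{-ct}\,(1 - e^{-c(1-t)})/(1-t)\, dt$. The sketch below evaluates this integral with the midpoint rule; the function name `success_probability` and the parameter `steps` are assumptions of this example.

```python
import math

def success_probability(c, steps=4000):
    """1 - L(f) for the constant threshold f(t) = c in the best-choice problem.

    Midpoint rule for int_0^1 exp(-c t) (1 - exp(-c (1 - t))) / (1 - t) dt;
    midpoints avoid evaluating the integrand at t = 1.
    """
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += math.exp(-c * t) * (1.0 - math.exp(-c * (1.0 - t))) / (1.0 - t)
    return total * h

if __name__ == "__main__":
    for c in (1.0, 1.503, 2.0):
        print(c, success_probability(c))
```

Scanning `c` over a grid gives a quick way to see that the success probability peaks near the threshold quoted in the text.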

It would be interesting to know for which $q$ the memoryless rules outperform the
rank rules and if it is possible, for unbounded $q$, to have the memoryless risk finite
while infinite for the rank rules. We sketch some results in this direction. From
the above elementary estimates $V(\mathcal{M}) < \infty$ provided $q(r) < c r^{\ell}$ for some constants
$c > 0$, $\ell > 0$. For such $q$ the risk of rank rules is also finite. Moreover, Mucci [17,
p. 426] showed that for the loss function $q(r) = r(r + 1) \cdots (r + \ell - 1)$ $(\ell \in \mathbb{N})$ the
minimum risk of rank rules is



(which extends the $\ell = 1$ result from [7]). For $\ell = 2$ the formula yields $33.260\cdots$,
while the $f_b$-rules do worse, with $\inf_b L(f_b) = 38.068\cdots$ (as computed from (11) and
(12) using the linearity of $L(f)$ in $q$).

In fact, $V(\mathcal{M}) < \infty$ for many loss functions growing much faster than polynomials.

Proposition 2. If $q(r) < c \exp(r^{\beta})$ for some $c > 0$ and $0 < \beta < 1$ then $V(\mathcal{M}) < \infty$.

Proof. The risk is finite for the memoryless rule with $f(t) = (1 - t)^{-\alpha}$ for any $\alpha >
(1 - \beta)^{-1}$. To see this, use the bound (9) and formulas



which also imply that for this rule $P(\tau = 1) = 0$. Now $E[\exp((X_\tau)^{\beta})]$ is estimated
from asymptotics of the incomplete gamma function. □

However, the risk is infinite for any stopping rule if $q$ grows too fast. The following
result is an analogue of PHI Proposition 5.3] for rank rules.

Proposition 3. If $Q(b, 1) = \infty$ for some $b \in \mathbb{R}_+$ then $V(\mathcal{F}) = \infty$, i.e. there is no
stopping rule $\tau \in \mathcal{F}$ with finite risk.

Proof. Choose any $x$ with $S(x) = x - b - b\log(x/b) > b$. The conditional loss
by stopping above $f_b$ is infinite, thus we can only consider stopping rules $\tau$ which
never do that and satisfy $P(\tau = 1) = 0$. On the other hand, on the event
$\{\mathcal{P} \cap \{(t, y) : y < \min(x, f_b(t))\} = \varnothing\}$ of nonzero probability, stopping occurs at some atom $(s, z)$ with
$s > 1 - b/x$, $z > x$, and averaging we see that the expected loss is infinite. □

Remark. By [10, Section 5], $V(\mathcal{R}) = \infty$ if $\sum_r (\log q(r))/r^2 = \infty$. For instance, the
loss structure $q(r) = e^r$ implies that the risk of rank rules is infinite. It is not known
if the risk of rank rules is finite for $q(r) = \exp(r^{\beta})$ with $0 < \beta < 1$.

For the sequel we assume that the loss function satisfies

$$\limsup_{r \to \infty} \frac{q(r + 1)}{q(r)} < C \qquad (13)$$
with some constant $C > 0$. The assumption implies that $Q(x, i) < \infty$ for all finite
$x$, $i$. Another consequence is that $E[q(R_\tau)] < \infty$ implies $E[q(R_\tau + N)] < \infty$ for $N$
either a fixed positive integer or a Poisson random variable, independent of $\tau$.

Lemma 4. If $E[q(R_\tau) \mid \mathbf{X}_0 = \mathbf{x}] < \infty$ then $E[q(R_\tau) \mid \mathbf{X}_0 = \mathbf{x}']$ is finite and continuous
in $x$, where $\mathbf{x}'$ is either $\mathbf{x} \cup x$ or $(x_1 + x, x_2 + x, \dots)$.

Proof. As $x$ changes to some $x'$, the outcome $R_\tau$ can only change if there is an atom
between $x$ and $x'$, which occurs with probability about $|x - x'|$ when $x$, $x'$ are close.
Conditionally on this event, the change of expected loss is bounded in consequence
of (13). □



3. Properties of the optimal rule  The optimal stopping problem in $\mathcal{F}$ is a
problem of Markovian type, associated with the time-homogeneous Markov process
$((\mathbf{X}_t, I_t),\ t \in [0, 1])$, with state-space $\mathcal{X} \times \overline{\mathbb{N}}$ and time-dependent loss $Q(\bar{t}X_t, I_t)$ for
stopping at time $t$. If $I_t$ assumes some finite value $i$ then $t \in \{T_r\}$ and $X_{t,i} = X_t$, which
combined with the fact that ranking of the arrivals after $t$ depends on $\mathcal{P} \cap ([0, t] \times \mathbb{R}_+)$
through $\mathbf{X}_t$ shows that $(\mathbf{X}_t, I_t)$ indeed summarises all relevant information up to time
$t$. We choose $(\mathbf{X}_t, I_t)$ in favour of the (probabilistically equivalent) data $(\mathbf{X}_{t-}, X_t)$ since
$x_i$ is well-defined as a function of $(\mathbf{x}, i)$ even if $\mathbf{x}$ has repetitions.

Following a well-known recipe, we consider a family of conditional stopping problems
parametrised by $(t, \mathbf{x})$. This corresponds to the class of stopping rules $\tau \ge
t$, $\tau \in \mathcal{F}$ that operate under the condition $\mathbf{X}_t = \mathbf{x}$. The effect of the conditioning
is that each $x_r < X_\tau$ contributes one unit to $R_\tau$ in the event $\tau < 1$. The variable
$t$ can be eliminated by a change of variables which exploits the self-similarity of $\mathcal{P}$
(a property which has no analogue in the finite-$n$ setting): for $t \in\, ]0, 1[$ fixed, the
affine mapping $(s, x) \mapsto ((s - t)/\bar{t},\ x\bar{t})$ preserves both the coordinate-wise order and
the Lebesgue measure, hence transforms the point process $\mathcal{P} \cap ([t, 1] \times \mathbb{R}_+)$ into a
distributional copy of $\mathcal{P}$ with the same ordering of the atoms. Thus we come to the
following conclusion:

Lemma 5. The stopping problem from time $t$ on with history $\mathbf{x}$ is equivalent to the
stopping problem starting with $\mathbf{X}_0 = \bar{t}\mathbf{x}$ at time $0$.

Let $v(\mathbf{x})$ be the minimum risk given $\mathbf{X}_0 = \mathbf{x}$. The function $v$, defined on the
whole of $\mathcal{X}$, satisfies a lower bound

$$v(\mathbf{x}) > \sum_{r=1}^{\infty} q(r) \left( e^{-x_{r-1}} - e^{-x_r} \right) \qquad (x_0 = 0), \qquad (14)$$

which is strict if the series converges (the bound is a continuous-time analogue of the
finite-$n$ 'half-prophet' bounds in [U Lemma 3.2]). The bound follows by observing
that $X_\tau$ cannot be smaller than the smallest value arrived on $[0, 1]$.
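For a configuration with finitely many finite terms the right-hand side of (14) is a finite sum and is easy to evaluate. The following sketch assumes the bound reads $\sum_r q(r)(e^{-x_{r-1}} - e^{-x_r})$ with $x_0 = 0$; the function name `half_prophet_bound` is hypothetical, and the argument `x` lists only the finite terms of the configuration in nondecreasing order.

```python
import math

def half_prophet_bound(x, q):
    """Right-hand side of (14) for a configuration with finite terms `x`.

    The smallest value arriving on [0, 1] is Exp(1)-distributed and falls
    between x_{r-1} and x_r with probability exp(-x_{r-1}) - exp(-x_r);
    the infinite padding contributes the last term q(len(x) + 1) * exp(-x_last).
    """
    total = 0.0
    prev = 0.0
    for r, xr in enumerate(x, start=1):
        total += q(r) * (math.exp(-prev) - math.exp(-xr))
        prev = xr
    total += q(len(x) + 1) * math.exp(-prev)
    return total

if __name__ == "__main__":
    linear = lambda r: float(r)
    print(half_prophet_bound([], linear))              # empty history: q(1)
    print(half_prophet_bound([math.log(2)], linear))   # 1 * 1/2 + 2 * 1/2
```

With the linear loss and a single earlier atom at level $\log 2$ the sum has two terms of weight $1/2$ each.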

If $V(\mathcal{F}) = \infty$ then, of course, $v(\mathbf{x}) = \infty$ everywhere, but for arbitrary unbounded
$q$ there exists a set of sequences $\mathbf{x} = (x_r)$, dense in $\mathcal{X}$, for which $x_r \uparrow \infty$ so slowly that
$v(\mathbf{x}) = \infty$. Thus if $q(\infty) = \infty$, the function $v$ is discontinuous at every point where
it is finite. If $q$ is truncated at $m$, then clearly $v$ depends only on the first $m - 1$
components of $\mathbf{x}$ and satisfies $v(\mathbf{x}) \le q(m)$. Let $\mathbf{0} = (0, 0, \dots)$.
Lemma 6. The following hold:

(i) $v(\mathbf{x}) < \infty$ implies that $v(\mathbf{x} \cup x)$ is finite and continuous in $x$,

(ii) if $q(\infty) < \infty$ then $v$ is continuous, and satisfies $v(\mathbf{x}) < q(\infty)$ for $x_1 > 0$,

(iii) $v(\mathbf{x}) \to q(\infty)$ as $\mathbf{x} \to \mathbf{0}$.

Proof. Let $\tau$ be $\epsilon$-optimal under the initial configuration $\mathbf{x} \cup x$. Applying $\tau$ under
$\mathbf{x} \cup x'$, Lemma 4 implies that $v(\mathbf{x} \cup x') < v(\mathbf{x} \cup x) + \epsilon$. Changing the roles of $x$, $x'$ and
letting $\epsilon \to 0$ yield (i). The continuity of $v$ follows directly from (i) if $q$ is truncated at
some $m$. The general bounded case follows by approximation as $m \to \infty$. Assertion
(iii) can be derived from (14). □

Lemma 7. If $q$ is not truncated then

(i) $Q(x, i)$ is strictly increasing in both $x$ and $i$,

(ii) $\mathbf{x} \prec \mathbf{y}$ implies $v(\mathbf{x}) > v(\mathbf{y})$ provided these are finite.

If $q$ is truncated at $m$ and $q(m - 1) < q(m)$ then (i) is valid only for $i \in [m]$,
$Q(x, i) = q(m) = q(\infty)$ for $i \ge m$, and a counterpart of (ii) holds for the order defined
on the first $m - 1$ components, with $v(\mathbf{x}) < q(m)$ for all $\mathbf{x} \in \mathcal{X}$ with $x_{m-1} > 0$.

Proof. Assertion (i) follows from (6) and the monotonicity of $q$. For (ii), observe that
$\mathbf{x} \prec \mathbf{y}$ implies $\#\{i : x_i < x\} \ge \#\{i : y_i < x\}$ for all $x > 0$. Hence for every rule $\tau$ the
stopped final rank under $\mathbf{X}_0 = \mathbf{x}$ cannot increase when the condition is replaced by
$\mathbf{X}_0 = \mathbf{y}$. □

Let $i(\mathbf{x}, x) := \#\{r : x_r < x\} + 1$ and suppose $\mathbf{x}$ satisfies $0 < x_1 < x_2 < \dots < \infty$.
Applying Lemma 7 we see that if $q$ is not truncated then the function $Q(x, i(\mathbf{x}, x))$ is
strictly increasing in $x$ from $q(1)$ to $q(\infty)$. If $q$ is truncated at $m$ and $q(m - 1) < q(m)$
then $Q(x, i(\mathbf{x}, x))$ is strictly increasing as $x$ varies from $0$ to $x_{m-1}$, with $Q(x, i(\mathbf{x}, x)) =
q(m)$ for $x \ge x_{m-1}$. On the other hand, $(\mathbf{x} \cup x) \prec (\mathbf{x} \cup y)$ for $x < y$, hence $v(\mathbf{x} \cup x)$
is nonincreasing in $x$. Thus introducing

$$h(\mathbf{x}) := \sup\{x : Q(x, i(\mathbf{x}, x)) < v(\mathbf{x} \cup x)\}$$

we have $Q(x, i(\mathbf{x}, x)) < v(\mathbf{x} \cup x)$ for $x < h(\mathbf{x})$, and $Q(x, i(\mathbf{x}, x)) > v(\mathbf{x} \cup x)$ for
$x > h(\mathbf{x})$. Subject to obvious adjustments, the definition of $h(\mathbf{x})$ makes sense for
every $\mathbf{x} \ne \mathbf{0}$ in the untruncated case, and for $x_{m-1} > 0$ in the truncated case.
We are ready to show that memoryless rules are not optimal.

Proposition 8. If $V(\mathcal{F}) < \infty$ then $V(\mathcal{F}) < V(\mathcal{M})$.

Proof. For a memoryless rule with threshold function $f$ to be optimal, we must have
$v(\bar{t}\mathbf{X}_t) < Q(\bar{t}X_t, i(\mathbf{X}_{t-}, X_t))$ for $X_t > f(t)$, and $v(\bar{t}\mathbf{X}_t) > Q(\bar{t}X_t, i(\mathbf{X}_{t-}, X_t))$ for
$X_t < f(t)$, because otherwise the rule can be improved. This forces $f(t) = h(\bar{t}\mathbf{x})$,
which does not hold since $h$ is not constant.

To demonstrate concretely how a memoryless rule with threshold $f$ can be improved
let us apply the same idea as in [U Section 5]. Assume $q(\infty) = \infty$. Suppose
$(t, x)$ is above the graph of $f$, hence should be skipped by the memoryless
rule. Let $i = i(\mathbf{x}, x)$ be the initial rank under history $\mathbf{x}$. Varying finitely many of
the components $x_r$ $(r > i)$ we can achieve that the bound (14) be arbitrarily large
while the expected loss of stopping, $Q(\bar{t}x, i)$, remains unaltered. For such $\mathbf{x}$ we have
$v(\bar{t}(\mathbf{x} \cup x)) > Q(\bar{t}x, i(\mathbf{x}, x))$, hence stopping strictly reduces the risk on some event of
positive probability. □

Based on the function $h : \mathcal{X} \to \overline{\mathbb{R}}_+$, we construct a predictable process

$$H_t := h(\mathbf{X}_{t-} \setminus \{X_{1,r} : T_r < t,\ X_{1,r} < h(\mathbf{X}_{T_r-})\}) \qquad (t \in [0, 1]).$$

Let $\mathbf{Y}_t$ be a thinned sequence obtained by removing the terms in $\{\cdots\}$ from $\mathbf{X}_{t-}$, so
$H_t = h(\mathbf{Y}_t)$. Intuitively, $H_t$ is a history-dependent threshold which depends on the
configuration of atoms of $\mathbf{X}_{t-}$ that arrived on $[0, t[$ and are above the curve $(H_s,\ s \in
[0, t[)$. As $t$ starts increasing from $0$, the process $H_t$ coincides with $h(\mathbf{X}_{t-})$ as long
as there are no atoms below the threshold, while at the first moment this occurs the
atom is discarded, and does not affect the future path of the process.

Remark. The reason for thinning $\mathcal{P}$ is that we wish to see $(H_t)$ as an increasing
process defined for all $t$, as opposed to considering $h(\mathbf{X}_{t-})$ killed as soon as the
threshold is undershot.

We list some properties of (Ht) which follow directly from the definition and Lem- 
mas M and (under X = 0). 

Lemma 9. (i) $(H_t)$ is nondecreasing on $[0, 1[$.

(ii) If $V(\mathcal{F}) < \infty$ then $H_0$ is the unique root of $Q(x, 1) = v(\varnothing \cup x)$.

(iii) $H_{1-} = Y_{1,m-1}$ if $q$ is truncated at $m$ and $q(m - 1) < q(m)$.

(iv) $H_{1-} = \infty$ if $q$ is not truncated.

To gain some intuition about the behaviour of $(H_t)$ we shall gradually increase the
complexity of the loss function. In the simplest instance of the best-choice problem, $v$
depends only on $x_1$ (see [T^l Equations (8) and (13)]) and there is an explicit formula
for the threshold

$$H_t = \min(f_b(t), Y_{t,1}) \qquad (b = 0.804\cdots).$$

That is to say, as $t$ starts increasing from $0$, $H_t$ is a deterministic drift process until
it hits the level of the lowest atom above the graph. The drift is hyperbolic due to
self-similarity of $\mathcal{P}$ (Lemma 5). After this random time, $H_t$ has a flat, which appears
because it is never optimal to stop at an observation with initial rank 2 or larger. On
the first part of the path $H_t$ satisfies $Q(H_t, 1) = v(\bar{t}(\mathbf{Y}_t \cup H_t))$, and on the second
$Q(H_t, 1) < v(\bar{t}(\mathbf{Y}_t \cup H_t))$.
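The best-choice threshold lends itself to a quick Monte Carlo check. Stopping at the first atom below $H_t = \min(f_b(t), Y_{t,1})$ amounts to stopping at the first 1-record $(t, x)$ with $x(1 - t) < b$, and the rule succeeds when that atom has final rank 1. The sketch below is an illustration under assumptions, not the paper's computation: the function name `simulate_best_choice` and the truncation level `cutoff` are hypothetical, and atoms with value above `cutoff` are ignored (the chance that they matter is negligible for `cutoff = 30`).

```python
import random

def simulate_best_choice(b=0.804, cutoff=30.0, rng=None):
    """One run of the rule 'stop at the first 1-record (t, x) with x (1 - t) < b'.

    Returns True when the stopped atom is the overall minimum (final rank 1).
    """
    rng = rng or random.Random()
    atoms = []                        # (arrival time, value), generated by increasing value
    x = rng.expovariate(1.0)
    while x < cutoff:
        atoms.append((rng.random(), x))
        x += rng.expovariate(1.0)
    if not atoms:
        return False
    best = atoms[0][1]                # the smallest value has final rank 1
    running_min = float("inf")
    for t, x in sorted(atoms):        # process atoms in order of arrival
        if x < running_min:           # a 1-record
            if x * (1.0 - t) < b:     # below the hyperbolic threshold b / (1 - t)
                return x == best
            running_min = x           # skipped record becomes the new level Y_{t,1}
    return False                      # never stopped: counted as a failure

if __name__ == "__main__":
    rng = random.Random(12345)
    runs = 20000
    wins = sum(simulate_best_choice(rng=rng) for _ in range(runs))
    print(wins / runs)
```

The empirical frequency of success can be compared with the value $0.580\cdots$ quoted earlier for $1 - V(\mathcal{F})$ in the best-choice problem; changing `b` shows how the success rate degrades away from the stated constant.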

If $q$ is strictly truncated at $m = 3$, meaning that $q(2) < q(3) = q(\infty)$, a new
effect appears. For $t$ sufficiently small, as long as $H_t < Y_{t,1}$ each 1-record above the
threshold causes a jump, because $v(\bar{t}\mathbf{Y}_t)$ jumps and the threshold must go up to
compensate. Thus $(H_t)$ has both drift and jump components. The jump locations
are the 1-record times accumulating near $0$ at rate $t^{-1}dt$. As $H_t$ hits $Y_{t,1}$, there is a
possible flat, then a period of deterministic drift where $Q(H_t, 2) = v(\bar{t}(\mathbf{Y}_t \cup H_t))$, and
finally there is a flat at some level $Y_{t,2}$ (then $Y_{t,2} = Y_{1,2}$).

For $q$ strictly truncated at $m > 3$, the jump locations are included in $m - 2$ record-time
processes of atoms with initial rank at most $m - 2$, there are $m - 1$ potential
flats and a drift component between the flats. We do not assert that the number of
flats is always exactly $m - 1$, because it is not at all clear if $(H_t)$ can break a level
$Y_{t,r}$ for $r < m - 1$ by jumping through it, hence sparing a flat.

Now suppose that $q$ is not truncated and that $H_t < \infty$ everywhere on $[0, 1[$ with
probability one. Then, outside the union of flat intervals, every arrival above $H_t$
causes a jump, thus the set of jump locations is dense there. The number of flats
may be infinite, and outside the flats $Q(H_t, i(\mathbf{Y}_t, H_t)) = v(\bar{t}(\mathbf{Y}_t \cup H_t))$.

In the case of Robbins' problem, we have by linearity of the loss $Q(x, i + 1) -
Q(x, i) = 1$ and $v(\mathbf{x} \cup x) - v(\mathbf{x}) < 1$ (if $v(\mathbf{x} \cup x) < \infty$). Thus $Q(x, i(\mathbf{x}, x)) = v(\mathbf{x} \cup x)$
implies $Q(x, i(\mathbf{x}, x) + 1) > v(\mathbf{x} \cup x \cup x')$ for arbitrary $x'$. But this means that $(H_t)$
cannot cross any $Y_{t,i}$ by a jump. It follows that $(H_t)$ has infinitely many flats at all
levels $Y_{1,r}$ $(r \in \mathbb{N})$. The presence of all three effects (drift, jumps and flats) and the
lack of the independence of increments property leave little hope for a more
explicit description of $(H_t)$.

The optimality principle requires stopping at atom $(t, x)$ when the history $\mathbf{X}_{t-} = \mathbf{x}$
satisfies $Q(\bar{t}x, i(\mathbf{x}, x)) < v(\bar{t}\mathbf{x})$, whence the following analogue of (2).

Proposition 10. If $V(\mathcal{F}) < \infty$ then $H_t < \infty$ a.s. for all $t < 1$ and the stopping rule

$$\tau^* := \inf\{t : X_t < H_t\} \qquad (\inf \varnothing = 1)$$

is optimal in $\mathcal{F}$.

Proof. For bounded $q$ a general result [21, Theorem 3, p. 127] is applicable since the
function $Q(x, i(\mathbf{x}, x))$ is bounded and continuous on $\mathcal{X} \times \overline{\mathbb{N}}$.

Alternatively, for $q$ truncated at some $m$ one can use results of the optimal stopping
theory for discrete-time processes. To fit exactly in this framework, focus on the
sequences of $i$-records (for $i \le m - 1$) that arrive on $[\epsilon, 1]$, and then let $\epsilon \to 0$. The
general bounded case follows in the limit $m \to \infty$.

For unbounded $q$ we use another kind of truncation (analogous to that in j3l Section
4]). For $m$ fixed, let $Q^{(m)}(x, i) = Q(x, \min(i, m))$ and consider the stopping problem
with loss $Q^{(m)}(\bar{t}x, i(\mathbf{x}, x))$ for stopping at $(t, x)$ with history $\mathbf{x}$. This corresponds
to ranking $x$ relative to at most $m$ atoms before $t$, but fully accounting for all future
observations below $x$. In this problem it is never optimal to stop at an atom with relative
rank $m$ or higher. Indeed, stopping at $(t, x)$ with such rank can be improved by
continuing and then exploiting any hyperbolic memoryless rule with $b < \bar{t}x$ (stopping
is guaranteed before $1$ since the subgraph of $f_b$ has infinite area). By discrete-time
methods, optimality of the rule $\tau^{(m)} = \inf\{t : X_t < H_t^{(m)}\}$ in the truncated problem
is readily acquired, with a nondecreasing predictable process $(H_t^{(m)})$ defined through
$h^{(m)}(\mathbf{x}) := \sup\{x : Q^{(m)}(x, i(\mathbf{x}, x)) < v^{(m)}(\mathbf{x} \cup x)\}$, where $v^{(m)}$ is the minimum loss
analogous to $v$. Obviously, $Q^{(m)}(x, i(\mathbf{x}, x))$ and $v^{(m)}(\mathbf{x})$ are nondecreasing in $m$.

A decisive property of this kind of truncation is that $Q^{(m)}(x, i) = Q(x, i)$ for $m > i$.
This implies that $H_t^{(m)}$ is eventually nondecreasing in $m$ and there exists a pointwise
limit $H'_t = \lim_{m \to \infty} H_t^{(m)}$, which defines a legitimate stopping rule $\tau'$ as the time
of the first arrival under $H'$. Denote for shorthand $L(\tau) = E[Q(X_\tau, I_\tau)]$, $L^{(m)}(\tau) =
E[Q^{(m)}(X_\tau, I_\tau)]$ and denote $u$, $u^{(m)}$ the minimum risks (so $u = V(\mathcal{F})$). Trivially,
$\lim_{m \to \infty} u^{(m)} \le u$. On the other hand, by monotone convergence $L^{(m)}(\tau') \uparrow L(\tau') \ge u$.
It follows that $L(\tau') \le u$ and $\tau'$ is optimal. The convergence $v^{(m)}(\mathbf{x}) \uparrow v(\mathbf{x})$ is shown
in the same way, from which $H'_t = H_t$ and $\tau' = \tau^*$ is optimal. □



Remark. Assumption (13) limits, by virtue of Lemma 4, the risks of all stopping
rules under various initial data, while we are really interested only in the properties
of optimal or $\epsilon$-optimal rules. We feel that Proposition 10 is still valid under the sole
condition $V(\mathcal{F}) < \infty$, but history dependence makes proving this more difficult than
in the analogous situation with rank rules [10].



As a by-product, we have shown that the risk in the truncated problem with loss
function $q(\min(r, m))$ converges to $V(\mathcal{F})$. Indeed, the loss is squeezed between the
loss in the modified truncated problem and the original untruncated loss.

From the formula for the distribution of the optimal rule,

$$P(\tau^* > t) = E\left[ \exp\left( -\int_0^t H_s\, ds \right) \right],$$

and arguing as in Lemma 1 we see that $H_t$ cannot explode at some $t < 1$ if $V(\mathcal{F}) < \infty$.
The risk can be bounded from below in the spirit of (7) as

$$E[q(R_{\tau^*})] \ge E\left[ \int_0^1 \exp\left( -\int_0^t H_s\, ds \right) dt \int_0^{H_t} Q(\bar{t}x, \varphi_H(x))\, dx \right],$$

where $\varphi_H(x)$ is the number of flats of $(H_t)$ below $x$. If the loss function $q$ has the
property that the flats of $(H_t)$ occur at all levels $Y_{1,r}$, $r \in \mathbb{N}$ (like in Robbins'
problem) the equality holds. The same kind of estimate is valid for every stopping
rule $\tau$ defined by means of an arbitrary nondecreasing predictable process like $(H_t)$.



4. The infinite Poisson model as a limit of finite-n problems  To connect the
finite-$n$ problem with its Poisson counterpart it is convenient to realise the iid sequence in
the following way [9], [HI H3]. Divide the strip $[0, 1] \times \mathbb{R}_+$ into $n$ vertical strips of the same
width $1/n$. Let $X_j$ be the atom of $\mathcal{P}$ with the lowest $x$-value in the $j$th strip. By properties of the
Poisson process, $X_1, \dots, X_n$ are iid with exponential distribution of rate $1/n$. Note
that optimal stopping of $X_1, \dots, X_n$ is equivalent to optimal stopping of $\mathcal{P}$ with the
lookback option allowing the observer to return to any atom within a given $1/n$-strip
(equivalently, at time $(j - 1)/n$ to foresee the configuration of atoms up to time $j/n$).
This embedding in $\mathcal{P}$ immediately implies $V_n(\mathcal{F}_n) < V(\mathcal{F})$. Moreover, as $n \to \infty$,
each $i$-record process derived from $X_1, \dots, X_n$ converges almost surely to the $i$-record
process derived from $\mathcal{P}$. From this one easily concludes, first for truncated then for
any bounded $q$, that $V_\infty(\mathcal{F}) = V(\mathcal{F})$, where $V_\infty(\mathcal{F}) = \lim_{n \to \infty} V_n(\mathcal{F}_n)$ as defined in the
Introduction.
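The embedding can be illustrated by simulation: generate the atoms in order of increasing value, assign each to one of the $n$ vertical strips according to its uniform arrival time, and record the first (hence lowest) atom landing in each strip. This is a sketch under assumptions: the function name `embedded_sample` and the parameter `cutoff_factor` are hypothetical, and generation stops at level `cutoff_factor * n`, which leaves a strip empty only with negligible probability.

```python
import random

def embedded_sample(n, rng, cutoff_factor=40.0):
    """Lowest atom value in each of n vertical strips of width 1/n.

    Atoms are produced by increasing value (unit-rate gaps) with uniform arrival
    times, so the first atom to land in a strip is the lowest one in that strip.
    """
    cutoff = cutoff_factor * n
    lowest = [None] * n
    filled = 0
    x = rng.expovariate(1.0)
    while filled < n and x < cutoff:
        j = int(rng.random() * n)     # index of the strip containing the atom
        if lowest[j] is None:
            lowest[j] = x
            filled += 1
        x += rng.expovariate(1.0)
    return lowest

if __name__ == "__main__":
    rng = random.Random(3)
    n = 5
    samples = [embedded_sample(n, rng) for _ in range(4000)]
    print(sum(s[0] for s in samples) / len(samples))   # close to n, the mean of Exp(rate 1/n)
```

The sample mean of $X_1$ should be close to $n$, consistent with the exponential distribution of rate $1/n$ stated above.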

For the general $q$, the relations

$$V_\infty(\mathcal{F}) = V(\mathcal{F}), \qquad V_\infty(\mathcal{R}) = V(\mathcal{R}), \qquad V_\infty(\mathcal{M}) = V(\mathcal{M})$$

follow (as in [H [2J [U [7J [9j [16]) from that in the truncated case, by combining monotonicity
of risks in the truncation parameter $m$ with the monotonicity in $n$ stated in
the next lemma.

Lemma 11. $V_n(\mathcal{F}_n)$, $V_n(\mathcal{R}_n)$, $V_n(\mathcal{M}_n)$ are increasing with $n$.

Proof. This is all standard, see the references above. We only add small details to pQ
Theorem 2.4] for the $\mathcal{M}$-case. Let $\tau$ be an optimal memoryless rule in the problem
of size $n + 1$, and let $\tau'$ be a modified memoryless strategy which always skips the
worst value $X_{n+1,n+1}$ but otherwise has the same thresholds as $\tau$. (To apply $\tau'$ the
observer must be able to recognise $X_{n+1,n+1}$ as it arrives.) Then $\tau'$ strictly improves
$\tau$ in the event that $\tau$ stops at $X_{n+1,n+1}$. On the other hand, strategy $\tau'$ performs as
a mixture of memoryless rules in the problem of size $n$, because given $X_{n+1,n+1} = x$
the other $X_j$'s are iid uniform on $[0, x]$. Therefore $V_n(\mathcal{M}_n) < V_{n+1}(\mathcal{M}_{n+1})$. □